Skip to content

Conversation

@flippmoke
Copy link
Member

Remove the class based approach, provide better ways to use existing string objects. Fixed a variety of other small issues.

@flippmoke flippmoke requested a review from joto March 5, 2019 20:29
Copy link

@joto joto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some detailed comments below. Most of what I say about the compression side applies equally to the decompression side.

The biggest problem I see is with the many overloads of the compress() and decompress() functions. I am not sure the new interface is better than the old class-based one.

constexpr std::size_t max_step = static_cast<std::size_t>(std::numeric_limits<unsigned int>::max());
unsigned int step_size = size > max_step ? max_step : static_cast<unsigned int>(size);
size -= step_size;
unsigned int buffer_size = buffering_size > step_size ? step_size : static_cast<unsigned int>(buffering_size);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there tests that all these calculations and casting creates the right results for 32bit and 64bit systems?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I do not have tests for this currently, I should probably add them.

}
#pragma GCC diagnostic pop

std::string buffer;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the extra buffer here? Can't we put the result in output directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a buffer here rather than directly writing to output because we need to resize at the end and if multiple loops are required we will be moving a much larger string rather then just reusing the existing buffer and appending it. In the old code this was done by using output.resize(...) followed by another output.resize(...) to shrink the size if necessary.

The problem with my solution is that it might allocate more memory than required though if a buffer size too large is selected. I will continue to think about this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocations/deallocations are expensive, copying the buffer probably less so. Using an extra buffer will give you more allocations. Not sure how the different sizes of allocations will impact the performance here. In the end you'd have to benchmark to decide this.

std::size_t size,
std::string& output,
int level = Z_DEFAULT_COMPRESSION,
std::size_t buffering_size = 0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the different overloads of the compress() function are somewhat confusing and I wonder if they are all correct and unique. The compiler will happily convert between int and size_t and the last parameters have a default value, so I fear in some situation the wrong overload could be chosen or the compiler would detect an ambiguity making the library hard to use. The same applies to the decompress() function below. The original class-based approach had the advantage at least that the configuration parameters were set at construction of the Compressor class.

output.append(buffer, 0, static_cast<std::size_t>(buffer_size - deflate_s.avail_out));
} while (deflate_s.avail_out == 0);
} while (size > 0);
deflateEnd(&deflate_s);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Return code of deflateEnd is not checked.
  2. If this function throws an exception, deflateEnd is not called and there is a memory leak.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only places I could see an exception being thrown would be from std::string::resize or std::string::append. Do you think its worth wrapping it all in a large try catch?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could argue that you will only get exceptions if you run out of memory, in which case you are screwed anyways. It would be cleaner to do it correctly, but at least it should be documented, so the next person to change the code knows that this isn't an oversight but a design decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants